Speech, noise and room impulse response data
Language Type:
Multilingual
Languages:
English German Spanish
Availability:
Available on request; may become public in the future
License:
Creative Commons Attribution-NonCommercial 4.0 International Public License
Size:
2.3Gbyte Other
Production Status:
released
Use:
Development of speech-enhancement algorithms
- Paper title: Optimization and evaluation of an intelligibility-improving signal processing approach (IISPA) for the Hurricane Challenge 2.0 with FADE
- Paper track: 13.4 Intelligibility-enhancing Speech Modification/Oral Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marc René Schädler | Hurricane Challenge 2.0 development set | /N |
Documentation:
Yes, English
pos-tagging
Bilingual corpora from Europarl (Koehn, 2005)
Language Type:
Multilingual
Languages:
English French German
Availability:
License:
Size:
2M tokens
Production Status:
Use:
Machine Translation, contrastive analysis
- Paper title: The Learnability of the Annotated Input in NMT Replicating (Vanmassenhove and Way, 2018) with OpenNMT
- Paper track: Evaluation/poster presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Nicolas Ballier | Annotated Europarl | /N |
Documentation:
Koehn 2005 paper
Written
Second language learner corpus
Language Type:
Multilingual
Languages:
Czech German Italian
Availability:
Publicly available
License:
CC BY-SA 4.0
Size:
2,286 texts
Production Status:
Finished
Use:
- Paper title: Reproducing Monolingual, Multilingual and Cross-Lingual CEFR Predictions
- Paper track: Evaluation/poster presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yves Bestgen | MERLIN corpus | /N |
Documentation:
See https://merlin-platform.eu/index.php
,
Language Type:
Multilingual
Languages:
Czech German Italian
Availability:
Freely Available
License:
<Not Specified>
Size:
2,286 texts
Production Status:
Newly created-finished
Use:
resource for language learning and teaching
- Paper title: Reproduction and Replication: A Case Study with Automatic Essay Scoring
- Paper track: Evaluation/poster presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eva Huber | MERLIN Corpus | /N |
Documentation:
None
Written
Terminology
Language Type:
Multilingual
Languages:
Arabic Dutch English French German Modern Greek Russian Spanish
Availability:
Freely Available
License:
Size:
4473 concepts
Production Status:
Existing-updated
Use:
Acquisition
- Paper title: Representing Multiword Term Variation in a Terminological Knowledge Base: a Corpus-Based Study
- Paper track: Terminology/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Pilar León-Araúz | EcoLexicon | /N |
Documentation:
https://ecolexicon.ugr.es/en/manual.htm
Written
Corpus
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
CLARIN ACA+BY+NORED (EULA)
Size:
556185 tokens
Production Status:
Existing-updated
Use:
Political Science
- Paper title: DEbateNet-mig15: Tracing the 2015 Immigration Debate in Germany Over Time
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Gabriella Lapesa | DEbateNet-mig15 | /N |
Documentation:
Yes, see below
Written
Corpus
Language Type:
Multilingual
Languages:
Chinese English French German Japanese Korean Russian Spanish
Availability:
Freely Available
License:
CC-BY-4
Size:
68000000 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Felipe Soares | ParaPat | /N |
Documentation:
None
Written
Corpus
Language Type:
Monolingual
Languages:
German
Availability:
Not Applicable
License:
Size:
1.7 billion words
Production Status:
Newly created-in progress
Use:
Hate Speech research
- Paper title: An Annotated Social Media Corpus for German
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eckhard Bick | XPEROHS corpus (German section) | /N |
Documentation:
None
Written
Corpus
Language Type:
Multilingual
Languages:
English French German Italian Spanish
Availability:
Freely Available
License:
Creative Commons
Size:
59364453 sentences
Production Status:
Newly created-finished
Use:
Word Sense Disambiguation
- Paper title: Sense-Annotated Corpora for Word Sense Disambiguation in Multiple Languages and Domains
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bianca Scarlini | OneSeC | /N |
Documentation:
None
Speech/Written
Corpus
Language Type:
Multilingual
Languages:
Chinese Dutch French German Italian Mongolian Persian Russian Spanish Swedish Turkish
Availability:
Freely Available
License:
CC0
Size:
700 hours
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Changhan Wang | CoVoST | /N |
Documentation:
https://github.com/facebookresearch/covost




